Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(ci): Switch to PR Reviews for triggering CI #20892

Merged
merged 3 commits into from
Jul 30, 2024

Conversation

jszwedko
Copy link
Member

Allows us to capture the correct SHA to run CI with.

Allows us to capture the correct SHA to run CI with.

Signed-off-by: Jesse Szwedko <[email protected]>
@jszwedko jszwedko added the no-changelog Changes in this PR do not need user-facing explanations in the release changelog label Jul 19, 2024
@jszwedko jszwedko marked this pull request as ready for review July 19, 2024 18:17
@datadog-vectordotdev
Copy link

datadog-vectordotdev bot commented Jul 19, 2024

Datadog Report

Branch report: jszwedko/switch-to-pr-review-comments
Commit report: c104210
Test service: vector

✅ 0 Failed, 7 Passed, 0 Skipped, 25.48s Total Time

.github/workflows/comment-trigger.yml Show resolved Hide resolved
.github/workflows/misc.yml Outdated Show resolved Hide resolved
Signed-off-by: Jesse Szwedko <[email protected]>
@jszwedko jszwedko requested a review from neuronull July 19, 2024 20:21
Copy link

@xopham xopham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the liberty to take a bit of an offensive view on things and made a few comments on parts not directly focus of the PR. I hope you don't mind. Feel free to consider these things in a separate effort in order to not blow up this PR.

I'd like to bring up an architecture question:

I think we are just developing an idea what secure coding for GitHub workflows might mean, but I took some inspiration from REST API security considerations.

Currently, we validate the comment author, check at the later action whether it was a PR review and select the GitHub reference for checkout. This creates a loose link between what was validated and what we run on. This is not necessarily vulnerable, but more likely to cause problems in the future (TOCTU- or parsing-confusion-type vulnerabilities).

Therefore, we might want to avoid such loosely linked references to validation. We could create a validate action that directly emits validated command and GitHub reference as outputs and have these consumed as inputs by subsequent steps, reusable workflows or composite actions thereafter.
What do you think?

Comment on lines 56 to 66
contains(github.event.review.body, '/ci-run-all')
|| contains(github.event.review.body, '/ci-run-cli')
|| contains(github.event.review.body, '/ci-run-misc')
|| contains(github.event.review.body, '/ci-run-deny')
|| contains(github.event.review.body, '/ci-run-component-features')
|| contains(github.event.review.body, '/ci-run-cross')
|| contains(github.event.review.body, '/ci-run-unit-mac')
|| contains(github.event.review.body, '/ci-run-unit-windows')
|| contains(github.event.review.body, '/ci-run-environment')
|| contains(github.event.review.body, '/ci-run-regression')
|| contains(github.event.review.body, '/ci-run-k8s')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to make this a startsWith to avoid accidental actions when e.g. including suggestions or other coincidental mentions?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 that seems reasonable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in d97fa61

Comment on lines -72 to -79
- name: Get PR author
id: pr
uses: tspascoal/get-user-teams-membership@v3
with:
username: ${{ github.event.issue.user.login }}
team: 'Vector'
GITHUB_TOKEN: ${{ steps.generate_token.outputs.token }}
- name: Get PR comment author
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat. Good to remove because this could have technically been vulnerable as others might push to another one's PR 💯

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true though it would have required:

  • A maintainer to create a fork
  • The maintainer to give a non-maintainer write access to their fork
  • The maintainer to create a PR from their fork

Typically maintainers create branches in the main repository. Agreed though, it is nice to get rid of.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or steal a GitHub token with write permissions and push changes to a maintainer's branch and merge via comment 😉

Comment on lines 97 to 99
if: ${{ contains(github.event.review.body, '/ci-run-integration-amqp')
|| contains(github.event.review.body, '/ci-run-integration-all')
|| contains(github.event.review.body, '/ci-run-all') }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense here and below to make this a startsWith to avoid accidental actions when e.g. including suggestions or other coincidental mentions? Essentially "enforcing" intention to run this and avoiding unintentional execution.

If yes, needs to be changed here and below. I mentioned at one other spot, but occurs throughout the workflows.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I'll change this to startsWith.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated in d97fa61

@@ -112,109 +105,109 @@ jobs:
command: bash scripts/ci-int-e2e-test.sh int amqp
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Besides the core of this PR, but related to the original reason ("vulnerable GH workflows"): given the architecture of these invocations, you might want add scripts/ and the .github/CODEOWNERS themselves to CODEOWNERS: https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#codeowners-and-branch-protection

Obviously depends on how externals can become contributors and whether there is review on every change from external.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand this one. Someone on @vectordotdev/vector reviews all external changes regardless. It is reasonable enough to add ownership for all files though. I'll open a separate PR for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #20947

Comment on lines -13 to +29
- name: (PR comment) Get PR branch
if: ${{ github.event_name == 'issue_comment' }}
uses: xt0rted/pull-request-comment-branch@v2
id: comment-branch

- name: (PR comment) Set latest commit status as pending
if: ${{ github.event_name == 'issue_comment' }}
- name: (PR review) Set latest commit status as pending
if: ${{ github.event_name == 'pull_request_review' }}
uses: myrotvorets/[email protected]
with:
sha: ${{ steps.comment-branch.outputs.head_sha }}
sha: ${{ github.events.review.commit_id }}
token: ${{ secrets.GITHUB_TOKEN }}
context: CLI - Linux
status: pending

- name: (PR comment) Checkout PR branch
if: ${{ github.event_name == 'issue_comment' }}
- name: (PR review) Checkout review SHA
if: ${{ github.event_name == 'pull_request_review' }}
uses: actions/checkout@v3
with:
ref: ${{ steps.comment-branch.outputs.head_ref }}
ref: ${{ github.events.review.commit_id }}

- name: Checkout branch
if: ${{ github.event_name != 'issue_comment' }}
if: ${{ github.event_name != 'pull_request_review' }}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could avoid this construct and create a more hard-linked reference: set an input to this reusable workflow to provide the ref, emit a "validated commit ID" in the validate job and pass it as input to these workflows.

the pattern is repeated in several reusable workflows.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I did notice this boilerplate repeated across workflows that could be DRY'd up, but didn't want to attempt that refactoring with this change.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, sounds good to me

# appears to be due to the potential length of it.
#concurrency:
# group: ${{ github.workflow }}-${{ github.event.issue.id }}-${{ github.event.comment.body }}
# group: ${{ github.workflow }}-${{ github.event.issue.id }}-${{ github.event.review.body }}
# cancel-in-progress: true

jobs:
validate:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you need a step here to validate whether the commit it's running on is the latest in the PR? Some labelling and other actions might give a false sense of what has been tested...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is necessary that the commit be the latest commit in the PR. The workflows will run again as part of the merge queue so if there are breakages they will be caught then. This comment trigger workflow is just to have the ability to trigger them without adding the PR to the merge queue.

An exception is the component features check which doesn't run on the merge queue, instead it runs nightly, but breakages to that are low urgency and can be resolved later.

@jszwedko
Copy link
Member Author

Currently, we validate the comment author, check at the later action whether it was a PR review and select the GitHub reference for checkout. This creates a loose link between what was validated and what we run on. This is not necessarily vulnerable, but more likely to cause problems in the future (TOCTU- or parsing-confusion-type vulnerabilities).

Therefore, we might want to avoid such loosely linked references to validation. We could create a validate action that directly emits validated command and GitHub reference as outputs and have these consumed as inputs by subsequent steps, reusable workflows or composite actions thereafter. What do you think?

I agree, it does seem like this could be refactored such that the comment triggers provide input to included workflows that includes the ref to checkout. This would also let us delete some of the duplicate steps across the triggered workflows. I'd like to leave this as a follow-up though if it is ok with you.

@jszwedko jszwedko requested a review from xopham July 26, 2024 20:41
Copy link
Contributor

@neuronull neuronull left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link

@xopham xopham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for sharing your insights. That's very helpful 🚀

@jszwedko jszwedko added this pull request to the merge queue Jul 30, 2024
Merged via the queue into master with commit 6e803a3 Jul 30, 2024
156 checks passed
@jszwedko jszwedko deleted the jszwedko/switch-to-pr-review-comments branch July 30, 2024 19:05
ym pushed a commit to ym/vector that referenced this pull request Aug 18, 2024
* chore(ci): Switch to PR Reviews for triggering CI

Allows us to capture the correct SHA to run CI with.

Signed-off-by: Jesse Szwedko <[email protected]>

* PR feedback

Signed-off-by: Jesse Szwedko <[email protected]>

* use startsWith rather than contains

Signed-off-by: Jesse Szwedko <[email protected]>

---------

Signed-off-by: Jesse Szwedko <[email protected]>
AndrooTheChen pushed a commit to discord/vector that referenced this pull request Sep 23, 2024
* chore(ci): Switch to PR Reviews for triggering CI

Allows us to capture the correct SHA to run CI with.

Signed-off-by: Jesse Szwedko <[email protected]>

* PR feedback

Signed-off-by: Jesse Szwedko <[email protected]>

* use startsWith rather than contains

Signed-off-by: Jesse Szwedko <[email protected]>

---------

Signed-off-by: Jesse Szwedko <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
no-changelog Changes in this PR do not need user-facing explanations in the release changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants